A practical methodology for defining histograms for predictions and scheduling
Abstract
Current distributed parallel platforms can provide the resources required to execute a scientific application efficiently. However, when these platforms are shared by multiple users, the performance of the applications using the system may be impacted in dynamic and often unpredictable ways. Performance prediction becomes increasingly difficult due to this dynamic behavior. Even performance modeling techniques that are built specifically for distributed parallel systems often require parameterization by single (point) values. However, in shared environments, point values may provide an inaccurate representation of application behavior due to variations in resource performance. This paper addresses the practical use of histogram-based stochastic values to parameterize performance models. Whereas a point value provides a single-value representation of a quantity, a stochastic value provides a set of possible values to represent a range of likely behavior. In previous work we investigated using either normal distributions or an upper and lower bound to represent stochastic data. In this work we examine using a combination of those methods, namely histograms, to represent this data. We define a practical approach to using histograms in a production setting, and then give experimental results for a set of applications under different load conditions.

1 Motivation

Current distributed parallel platforms can provide the resources required to execute a scientific application efficiently. However, when these platforms are shared by multiple users, the performance of the applications using the system may be impacted in dynamic and often unpredictable ways. In order to obtain good performance, accurate performance prediction models for distributed parallel systems are needed. Most performance prediction models use parameters to describe system and application characteristics such as bandwidth, available CPU, message size, and operation counts.
Model parameters are generally represented as a single likely value, which we refer to as a point value. For example, a point value for bandwidth might be 7 Mbits/second. In practice, point values are often a best guess, an estimate under ideal circumstances, or a value that is accurate only for a given timeframe. In some situations it may be more accurate to represent system and application characteristics as a range of possible values; for example, bandwidth might be reported as varying between 6 and 8 Mbits/second. We refer to such values as stochastic values. Whereas a point value gives a single value for a quantity, a stochastic value gives a set of values, possibly weighted by probabilities, to represent a range of likely behavior [TK93].
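To make the distinction concrete, the following is a minimal sketch of how a histogram stochastic value could be built from measurements and pushed through a very simple model. It is an illustration under stated assumptions, not the method defined in this paper: the fixed-width binning rule, the measurement data, the 56 Mbit message size, the model time = message size / bandwidth, and the names `build_histogram`, `predict_transfer_time`, and `mean` are all hypothetical.

```python
from collections import Counter


def build_histogram(samples, bin_width):
    """Return sorted (bin_midpoint, probability) pairs for the samples.

    Assumption: fixed-width bins starting at multiples of bin_width.
    """
    if not samples:
        raise ValueError("need at least one sample")
    counts = Counter(int(s // bin_width) for s in samples)
    total = len(samples)
    return [((k + 0.5) * bin_width, c / total) for k, c in sorted(counts.items())]


def predict_transfer_time(message_mbits, bandwidth_hist):
    """Map each bandwidth bin to a transfer time, keeping its probability."""
    return sorted((message_mbits / bw, p) for bw, p in bandwidth_hist)


def mean(hist):
    """Probability-weighted mean of a histogram stochastic value."""
    return sum(value * p for value, p in hist)


# Hypothetical bandwidth measurements in Mbits/second (illustrative only).
measurements = [6.1, 6.4, 6.9, 7.0, 7.2, 7.3, 7.4, 7.8, 7.9, 6.6]

# Stochastic value: a set of bandwidths weighted by probabilities.
bandwidth_hist = build_histogram(measurements, bin_width=1.0)

# Stochastic prediction: one transfer time per bin, same weights.
time_hist = predict_transfer_time(56.0, bandwidth_hist)

# Point-value prediction using the single "likely" bandwidth of 7 Mbits/second.
point_prediction = 56.0 / 7.0

print("bandwidth histogram:", bandwidth_hist)
print("transfer-time histogram:", time_hist)
print("point prediction (s):", point_prediction)
print("mean of stochastic prediction (s):", mean(time_hist))
```

The point prediction collapses the 6 to 8 Mbits/second range into one number, whereas the histogram keeps both the spread of transfer times and how likely each one is, which is the information a scheduler would need to reason about variability.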
Publication date: 1999